DifFace: Blind Face Restoration with Diffused Error Contraction
While deep learning-based methods for blind face restoration have achieved
unprecedented success, they still suffer from two major limitations. First,
most of them deteriorate when facing complex degradations outside their training
data. Second, these methods require multiple constraints, e.g., fidelity,
perceptual, and adversarial losses, which demand laborious hyper-parameter
tuning to stabilize and balance their influences. In this work, we propose a
novel method named DifFace that is capable of coping with unseen and complex
degradations more gracefully without complicated loss designs. The key to our
method is to establish a posterior distribution from the observed low-quality
(LQ) image to its high-quality (HQ) counterpart. In particular, we design a
transition distribution from the LQ image to an intermediate state of a
pre-trained diffusion model, and then gradually transition from this
intermediate state to the HQ target by recursively applying the pre-trained
diffusion model. The transition distribution relies only on a restoration
backbone that is trained with a simple restoration loss on some synthetic
data, which favorably avoids the
cumbersome training process in existing methods. Moreover, the transition
distribution can contract the error of the restoration backbone and thus makes
our method more robust to unknown degradations. Comprehensive experiments show
that DifFace is superior to current state-of-the-art methods, especially in
cases with severe degradations. Our code and model are available at
https://github.com/zsyOAOA/DifFace.
Comment: 21 pages
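The two-stage inference described in this abstract can be sketched in a few lines. The noise schedule, the backbone, and the reverse step below are toy placeholders (not the paper's trained networks); only the control flow follows the description: restore, diffuse the estimate to an intermediate timestep tau, then recursively denoise back to timestep 0.

```python
import numpy as np

rng = np.random.default_rng(0)

def diffuse_to(x0, tau, alpha_bar):
    """Forward-diffuse an estimate x0 to intermediate timestep tau, i.e.
    sample from q(x_tau | x0) of a standard DDPM."""
    a = alpha_bar[tau]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * rng.standard_normal(x0.shape)

def difface_restore(y_lq, backbone, denoise_step, alpha_bar, tau):
    """DifFace-style inference sketch: a cheap restoration backbone maps the
    LQ input to a rough estimate, the transition distribution diffuses that
    estimate to an intermediate state x_tau, and the pre-trained reverse
    diffusion chain is then applied recursively from tau down to 0."""
    x = diffuse_to(backbone(y_lq), tau, alpha_bar)  # transition distribution
    for t in range(tau, 0, -1):                     # recursive reverse steps
        x = denoise_step(x, t)
    return x

# Toy stand-ins, NOT the paper's trained networks:
alpha_bar = np.linspace(1.0, 0.01, 1000)   # cumulative noise schedule
backbone = lambda y: np.clip(y, 0.0, 1.0)  # placeholder restorer
denoise_step = lambda x, t: 0.99 * x       # placeholder reverse step

y_lq = rng.random((8, 8))
x_hq = difface_restore(y_lq, backbone, denoise_step, alpha_bar, tau=100)
print(x_hq.shape)  # (8, 8)
```

Because the backbone only has to supply a rough starting point for the diffusion, its estimation error is contracted by the subsequent reverse steps rather than passed through directly.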
ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting
Diffusion-based image super-resolution (SR) methods are mainly limited by the
low inference speed due to the requirements of hundreds or even thousands of
sampling steps. Existing acceleration sampling techniques inevitably sacrifice
performance to some extent, leading to over-blurry SR results. To address this
issue, we propose a novel and efficient diffusion model for SR that
significantly reduces the number of diffusion steps, thereby eliminating the
need for post-acceleration during inference and its associated performance
deterioration. Our method constructs a Markov chain that transfers between the
high-resolution image and the low-resolution image by shifting the residual
between them, substantially improving the transition efficiency. Additionally,
an elaborate noise schedule is developed to flexibly control the shifting speed
and the noise strength during the diffusion process. Extensive experiments
demonstrate that the proposed method obtains superior or at least comparable
performance to current state-of-the-art methods on both synthetic and
real-world datasets, even with only 15 sampling steps. Our code and model are
available at https://github.com/zsyOAOA/ResShift.
Comment: 17 pages, 7 figures
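A minimal sketch of the residual-shifting forward chain described above, assuming marginals of the form x_t = x_0 + eta_t * (y - x_0) + kappa * sqrt(eta_t) * noise with a monotone schedule eta_t. The schedule values and kappa below are illustrative, not the paper's; the point is only that the chain interpolates between the HR image (eta ~ 0) and a noisy LR image (eta ~ 1) in very few steps.

```python
import numpy as np

rng = np.random.default_rng(0)

def resshift_forward(x0, y, etas, kappa=1.0):
    """Sketch of a residual-shifting forward chain: instead of diffusing x0
    toward pure noise, each marginal shifts the residual e0 = y - x0, so the
    chain transfers between the HR image and the LR image directly. The
    schedule etas controls the shifting speed, kappa the noise strength."""
    e0 = y - x0
    states = []
    for eta in etas:
        noise = rng.standard_normal(x0.shape)
        states.append(x0 + eta * e0 + kappa * np.sqrt(eta) * noise)
    return states

hr = rng.random((16, 16))           # high-resolution target (toy)
lr_up = 0.5 * np.ones((16, 16))     # upsampled low-resolution input (toy)
etas = np.linspace(0.001, 1.0, 15)  # 15-step shifting schedule (illustrative)
chain = resshift_forward(hr, lr_up, etas, kappa=0.1)
print(len(chain))  # 15
```

The first state stays close to the HR image and the last one close to the (noised) LR image, which is why a learned reverse chain over the same 15 steps needs no post-acceleration.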
Variational Denoising Network: Toward Blind Noise Modeling and Removal
Blind image denoising is an important yet very challenging problem in
computer vision due to the complicated acquisition process of real images. In
this work we propose a new variational inference method, which integrates both
noise estimation and image denoising into a unique Bayesian framework, for
blind image denoising. Specifically, an approximate posterior, parameterized by
deep neural networks, is presented by taking the intrinsic clean image and
noise variances as latent variables conditioned on the input noisy image. This
posterior provides explicit parametric forms for all its involved
hyper-parameters, and thus can be easily implemented for blind image denoising
with automatic noise estimation for the test noisy image. On the one hand, like
other data-driven deep learning methods, our method, the variational denoising
network (VDN), performs denoising efficiently thanks to the explicit form of
its posterior expression. On the other hand, VDN inherits the advantages of
traditional model-driven approaches, especially the good generalization
capability of generative models. VDN has good interpretability and can be
flexibly utilized to estimate and remove complicated non-i.i.d. noise collected
in real scenarios. Comprehensive experiments are performed to substantiate the
superiority of our method in blind image denoising.
Comment: 11 pages, 4 figures
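The idea of treating the clean image and a spatially varying noise variance as jointly predicted quantities can be illustrated with the Gaussian negative log-likelihood such a posterior implies. This is only a stand-in for the data term of a VDN-style objective, not the paper's full variational lower bound; the predicted log-variance map lets the loss automatically down-weight pixels it deems noisier.

```python
import numpy as np

def gaussian_nll(noisy, pred_clean, pred_logvar):
    """Per-pixel Gaussian negative log-likelihood: the model predicts both
    the clean image and a per-pixel (non-i.i.d.) noise log-variance map."""
    var = np.exp(pred_logvar)
    return 0.5 * np.mean(pred_logvar + (noisy - pred_clean) ** 2 / var)

rng = np.random.default_rng(0)
clean = rng.random((32, 32))
sigma_map = 0.05 + 0.15 * rng.random((32, 32))  # non-i.i.d. noise levels
noisy = clean + sigma_map * rng.standard_normal((32, 32))

# An oracle that knows the true per-pixel variance scores a lower (better)
# NLL than one that assumes a wrong, constant variance:
nll_oracle = gaussian_nll(noisy, clean, 2 * np.log(sigma_map))
nll_wrong = gaussian_nll(noisy, clean, 2 * np.log(0.5) * np.ones((32, 32)))
print(bool(nll_oracle < nll_wrong))
```

Minimizing this quantity over network parameters is what gives the method its built-in, automatic noise estimation at test time.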
Hyperspectral Image Restoration under Complex Multi-Band Noises
Hyperspectral images (HSIs) are always corrupted by complicated forms of noise
during the acquisition process, such as Gaussian noise, impulse noise, stripes,
deadlines, and so on. Specifically, different bands of practical HSIs generally
contain noises of evidently distinct type and extent. While current HSI
restoration methods give less consideration to such band-noise-distinctness
issues, this study elaborately constructs a new HSI restoration technique aimed
at taking such noise characteristics into account more faithfully and
comprehensively. Particularly, through a two-level hierarchical Dirichlet
process (HDP) to model the HSI noise structure, the noise of each band is
depicted by a Dirichlet process Gaussian mixture model (DP-GMM), whose
complexity can be flexibly adapted in an automatic manner. Besides, the DP-GMM
of each band comes from a higher-level DP-GMM that relates the noise across
bands. A variational Bayes algorithm is designed to solve this model, and
closed-form updating equations for all involved parameters are deduced.
Experiments indicate that, in terms of the mean peak signal-to-noise ratio
(MPSNR), the proposed method is on average 1 dB higher than existing
state-of-the-art methods, and it also performs better in terms of the mean
structural similarity index (MSSIM) and Erreur Relative Globale Adimensionnelle
de Synthèse (ERGAS).
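The per-band mixture-noise idea can be illustrated with a tiny fixed-k EM fit. The paper's DP-GMM adapts the component count automatically and ties the bands together through a higher-level HDP; this numpy-only sketch merely shows why bands carrying different noise types (pure Gaussian versus Gaussian plus impulse-like outliers) yield evidently different fitted mixtures.

```python
import numpy as np

def fit_gmm_1d(x, k=2, iters=50, seed=0):
    """Tiny EM fit of a k-component 1-D Gaussian mixture (a fixed-k stand-in
    for the per-band DP-GMM noise model)."""
    rng = np.random.default_rng(seed)
    mu = rng.choice(x, k)
    var = np.full(k, x.var())
    pi = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: per-point component responsibilities
        logp = -0.5 * ((x[:, None] - mu) ** 2 / var + np.log(2 * np.pi * var))
        logp = logp + np.log(pi)
        r = np.exp(logp - logp.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted updates of weights, means, variances
        nk = r.sum(axis=0)
        pi = nk / len(x)
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

# Two simulated bands with evidently distinct noise:
rng = np.random.default_rng(1)
band1 = 0.05 * rng.standard_normal(5000)                  # pure Gaussian
band2 = np.concatenate([0.05 * rng.standard_normal(4500),
                        0.8 * rng.standard_normal(500)])  # + impulse-like tail
_, _, var1 = fit_gmm_1d(band1)
_, _, var2 = fit_gmm_1d(band2)
print(var1.max(), var2.max())
```

The corrupted band's fitted mixture contains a markedly broader component, which is exactly the kind of per-band structure the hierarchical model captures without hand-tuning k.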
Unsupervised Pansharpening via Low-rank Diffusion Model
Pansharpening is the process of merging a high-resolution panchromatic (PAN)
image and a low-resolution multispectral (LRMS) image to create a single
high-resolution multispectral (HRMS) image. Most existing deep
learning-based pansharpening methods have poor generalization ability, and the
traditional model-based pansharpening methods need careful manual exploration
for the image structure prior. To alleviate these issues, this paper proposes
an unsupervised pansharpening method by combining the diffusion model with the
low-rank matrix factorization technique. Specifically, we assume that the HRMS
image is decomposed into the product of two low-rank tensors, i.e., the base
tensor and the coefficient matrix. The base tensor lies in the image domain and
has a low spectral dimension, so we can conveniently utilize a pre-trained
remote sensing diffusion model to capture its image structures. Additionally,
we derive a simple yet quite effective way to pre-estimate the coefficient
matrix from the observed LRMS image, which preserves the spectral information
of the HRMS. Extensive experimental results on some benchmark datasets
demonstrate that our proposed method performs better than traditional
model-based approaches and has better generalization ability than deep
learning-based techniques. The code is released at
https://github.com/xyrui/PLRDiff
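The low-rank decomposition above, and the idea of pre-estimating the coefficient matrix from the LRMS image, can be sketched with a truncated SVD. The SVD-based estimation below is a simple stand-in under the stated low-rank assumption, not necessarily the paper's exact procedure, and the data is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy HRMS cube: H x W spatial grid, S spectral bands, spectral rank r < S.
H, W, S, r = 32, 32, 8, 3
base = rng.random((H, W, r))        # base tensor (low spectral dimension)
coeff = rng.random((r, S))          # coefficient matrix
hrms = base.reshape(-1, r) @ coeff  # HRMS unfolded along the spectral mode

# Pre-estimate the coefficient matrix from a (here: spatially downsampled)
# LRMS observation via truncated SVD of its spectral unfolding.
lrms = hrms.reshape(H, W, S)[::4, ::4, :].reshape(-1, S)
_, _, vt = np.linalg.svd(lrms, full_matrices=False)
coeff_hat = vt[:r]                  # estimated spectral subspace

# The estimate spans the true spectral subspace: projecting the full HRMS
# onto it loses essentially nothing, so the spectral information survives
# the downsampling and only the base tensor remains to be recovered (which
# is where the pre-trained image diffusion prior comes in).
proj = hrms @ coeff_hat.T @ coeff_hat
err = np.linalg.norm(proj - hrms) / np.linalg.norm(hrms)
print(err)
```

This is why the spectral factor can be fixed up front while the diffusion model only has to generate the low-spectral-dimension base tensor.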
Exploiting Diffusion Prior for Real-World Image Super-Resolution
We present a novel approach to leverage prior knowledge encapsulated in
pre-trained text-to-image diffusion models for blind super-resolution (SR).
Specifically, by employing our time-aware encoder, we can achieve promising
restoration results without altering the pre-trained synthesis model, thereby
preserving the generative prior and minimizing training cost. To remedy the
loss of fidelity caused by the inherent stochasticity of diffusion models, we
introduce a controllable feature wrapping module that allows users to balance
quality and fidelity by simply adjusting a scalar value during the inference
process. Moreover, we develop a progressive aggregation sampling strategy to
overcome the fixed-size constraints of pre-trained diffusion models, enabling
adaptation to resolutions of any size. A comprehensive evaluation of our method
using both synthetic and real-world benchmarks demonstrates its superiority
over current state-of-the-art approaches.
Comment: Project page: https://iceclear.github.io/projects/stablesr
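The user-facing control described above, a single scalar trading quality against fidelity at inference time, can be illustrated with a linear blend. The real module is learned and operates on diffusion features; the function below is only a sketch of the scalar knob, with all names and shapes hypothetical.

```python
import numpy as np

def feature_blend(f_gen, f_lq, w):
    """Sketch of a scalar quality/fidelity control: w in [0, 1] mixes
    generatively decoded features (w -> 0, higher perceptual quality) with
    LQ-derived features (w -> 1, higher fidelity to the input). The paper's
    module learns this combination; a linear blend only shows the knob."""
    return (1.0 - w) * f_gen + w * f_lq

rng = np.random.default_rng(0)
f_gen = rng.random((4, 4))  # toy decoder features
f_lq = rng.random((4, 4))   # toy features derived from the LQ input

assert np.allclose(feature_blend(f_gen, f_lq, 0.0), f_gen)  # pure generative
assert np.allclose(feature_blend(f_gen, f_lq, 1.0), f_lq)   # pure fidelity
```

Because w is adjusted only at inference, no retraining is needed to move along the quality/fidelity trade-off.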